ABSTRACT

The development of a series of parallel single-topic
tests for testing attainment of 14 objectives concerned with inquiry
skill in biology is discussed. The series of eight two-part tests is
called "Explorations in Biology" (EIB). (CK)

Criterion-referenced Tests in Biology



Eugenia M. Koos and James Y. Chan 
Mid-Continent Regional Educational Laboratory

Introduction 



This report deals with the development of a series of parallel single-topic tests 
developed by McREL staff members to test attainment of fourteen objectives concerned 
with inquiry skill in biology. These objectives were identified as being accessible to 
assessment by means of paper-and-pencil multiple-choice tests. The series of eight 
two-part tests is called EXPLORATIONS IN BIOLOGY (EIB hereafter). The single-topic,
simulation format was selected to accommodate the unitary nature of an inquiry.
The data resulting from field trials of the six EIB Topics available in 1971 were made 
possible through the cooperation of a number of schools and colleges interested in aid- 
ing in the development of a measure by which effectiveness of instruction in inquiry 
processes might be assessed. Over 1500 students in private and public schools have 
taken these tests during the development process begun three years ago; states involved 
were Connecticut, Illinois, Hawaii, Louisiana, Missouri, Nebraska and Pennsylvania. 



Objectives were selected for the EIB based on studies by Burmester¹, Kaplan²,
Suchman³ and Taba⁴. With the completion of the detailed McREL-BSCS set of objectives⁵,
studies were made to learn the extent to which EIB items could be referenced
to similar objectives listed in that document (IOTB hereafter).



Items at several difficulty levels could be written to test the objectives to which 
the EIB is referenced. This particular series is intended for the average student in 
the first course in high school biology, usually offered in the sophomore year. Some 
of this target group taking the EIB's as a pre-test in the fall may demonstrate attain- 
ment of a particular criterion level or score set by the teacher. There would then be 
no need for these particular students to take instruction intended to guide the target 
group toward this level. A need is implied by this contingency, of course, for 
EIB tests to be written to test objectives at higher levels of complexity. 



1. Burmester, Mary Alice. Behavior involved in the critical aspects of scientific
thinking. Science Education, December, 1952, 36, 259-263.

2. Kaplan, E. H. The Burmester Test of aspects of scientific thinking as a means
of teaching the mechanics of the scientific method. Science Education, October, 1967,
51 (4), 353-357.

3. Suchman, J. R. The Elementary School Training Program in Scientific Inquiry. 
Project 216, USOE, University of Illinois, June, 1962. 

4. Taba, Hilda. Teaching strategy and learning. California Journal of Instructional 
Improvement. December, 1963. 

5. Bingman, R. M., Ed. Inquiry Objectives in the Teaching of Biology. Kansas City:
McREL, 1969.

The next major objective, however, is to resolve the lack of equivalence
among the various EIB Topics, followed by studies of the effectiveness of
objective-referenced items pooled from several Topics.

Characteristics of Items 

Items were written by educators and test construction specialists familiar with
10th-grade biology curricula and with inquiry processes. The number of response
options (two to five) offered for each item differs from section to section. All are
uniform in that a decision based on what has been presented in the test to that point
is to be made. The vocabulary level is generally ninth to tenth grade⁶.

To check appropriateness of these items for the target group, a wide range of 
students was sampled, including ninth-graders and college freshmen. The overlap 
in scores between the high school level and the college level led to the inclusion of 
college freshmen in the target group. It was, however, concluded that the EIB
is not appropriate for any ninth-grade group other than those identified as having 
above-average aptitude or achievement. 

Two of the usual methods of item analysis are being applied to Topics administered
last spring. First, if two options appeared equally attractive to most
students, the item is revised to give the "correct" response option a more
clear-cut appeal. Second, if an item showed negative discrimination, it is
revised or eliminated and another substituted.


Items responded to correctly by only a few students at the end of the term are 
also revised, on the assumption that a sizeable proportion of the total sample had 
had effective instruction in inquiry processes during the year. 
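The three revision rules above can be sketched in code. This is a minimal illustration only, not the procedure McREL used: the upper-lower discrimination index, the 27 percent group size, and the thresholds `tie_margin` and `min_p` are all assumptions introduced here, and the function names are hypothetical.

```python
from collections import Counter


def option_shares(responses):
    """Fraction of examinees choosing each option on one item."""
    counts = Counter(responses)
    n = len(responses)
    return {opt: c / n for opt, c in counts.items()}


def discrimination_index(responses, totals, keyed, fraction=0.27):
    """Upper-lower index: p(keyed) in the top group minus p(keyed) in the bottom group.

    Groups are the top and bottom `fraction` of examinees by total score
    (the 27 percent convention is an assumption, not taken from the report).
    """
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    k = max(1, int(round(fraction * len(totals))))
    low, high = order[:k], order[-k:]
    p_high = sum(responses[i] == keyed for i in high) / k
    p_low = sum(responses[i] == keyed for i in low) / k
    return p_high - p_low


def flag_item(responses, totals, keyed, tie_margin=0.05, min_p=0.20):
    """Return the reasons (if any) an item would be sent back for revision."""
    shares = option_shares(responses)
    ranked = sorted(shares.values(), reverse=True)
    flags = []
    # Rule 1: the two most popular options are about equally attractive.
    if len(ranked) > 1 and ranked[0] - ranked[1] <= tie_margin:
        flags.append("two options equally attractive")
    # Rule 2: low scorers pick the keyed option more often than high scorers.
    if discrimination_index(responses, totals, keyed) < 0:
        flags.append("negative discrimination")
    # Rule 3: only a few students choose the keyed option at the end of term.
    if shares.get(keyed, 0.0) < min_p:
        flags.append("few correct at end of term")
    return flags
```

Here `responses` holds one chosen option per student for a single item, `totals` holds the same students' total scores, and `keyed` is the preferred option; an empty list from `flag_item` means none of the three rules was triggered.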

From comparisons of percentages of students selecting "correct" answers in 
the fall prior to instruction and later on at the end of the term, it will be possible 
to identify those items most sensitive to effects of instruction; the extent to which 
such items are indicators of effect of inquiry instruction can be reflected by the 
scoring weight attached to the desired response on these items. 
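The fall-to-spring comparison described above amounts to a gain in the proportion of students choosing the desired response. The sketch below computes that gain and then turns it into a scoring weight; the weighting rule (`suggested_weight`, with its `base` and `max_bonus` parameters) is purely hypothetical, since the report does not state how weights are to be derived from the gain.

```python
def proportion_correct(responses, keyed):
    """Share of students who chose the keyed (desired) option."""
    return sum(r == keyed for r in responses) / len(responses)


def sensitivity(pre, post, keyed):
    """Gain in proportion correct from the fall pre-test to the end-of-term test."""
    return proportion_correct(post, keyed) - proportion_correct(pre, keyed)


def suggested_weight(gain, base=1, max_bonus=2):
    """Hypothetical rule: base weight plus a bonus that grows with positive gain.

    Items that show no gain, or a loss, keep the base weight.
    """
    return base + round(max_bonus * max(0.0, gain))
```

An item answered correctly by a quarter of the students in the fall and three quarters in the spring has a gain of 0.5 and, under this illustrative rule, a weight of 2.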



6. Dale, E. and Chall, Jeanne S. A formula for predicting readability. In
Hunnicutt, C. W. and Iverson, W. J. Research in the Three R's. New York:
Harper and Brothers, 1958, 194-213.

Characteristics of the Scoring System

Because the scoring of responses is based on the judgment of test writers as to 
the preferred inquiry process response, and not on answers that are related to memory 
for facts or understanding of concepts, a weighted scoring system appeared desirable. 
This resulted in some variability among the total scores possible for the various booklets.
Maximum possible scores for the two parts of each Topic (planning and implementation
of the investigation) as well as scores that might be attained by chance are
shown in Table 1. The latter might prove useful to the teacher in the setting of the 
criterion score to be attained after instruction. 

Table 1

Maximum Score Possible and Chance Score for EIB 1-6

EIB      Maximum Possible Score      Chance Score

1-A               72                    35.36
1-B               89                    38.80
2-A               75                    36.70
2-B               82                    35.45
3-A               77                    33.84
3-B               78                    34.45
4-A               78                    35.14
4-B               55                    24.20
5-A               66                    31.40
5-B              113                    51.62
6-A               87                    37.70
6-B               54                    24.90
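
The report does not say how the chance scores in Table 1 were computed. One plausible reading, offered here only as an assumption, is that a chance score is the expected total for a student who picks every option at random with equal probability. Under a weighted scoring system that is the sum, over items, of the mean option weight, while the maximum score is the sum of the highest weight on each item. The weights in the example are invented for illustration.

```python
def max_score(items):
    """Highest attainable total: the best option weight on every item."""
    return sum(max(weights) for weights in items)


def chance_score(items):
    """Expected total under uniform random guessing (an assumed model).

    Each item contributes the average of its option weights.
    """
    return sum(sum(weights) / len(weights) for weights in items)


# Hypothetical booklet: three items with 3, 2 and 5 weighted options.
booklet = [[3, 1, 0], [2, 0], [4, 2, 1, 0, 0]]
print(max_score(booklet), round(chance_score(booklet), 2))
```

In Table 1 the chance score falls between roughly 43 and 49 percent of the maximum for every booklet, so a criterion score set by the teacher would need to sit well above that band to indicate more than guessing.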

Concurrent Validity

The relationship between total EIB-1 scores and multiple ratings by peers or
teachers (studied separately) on the Behavioral Checklist for Science Students⁷
was studied based on the rationale that classroom and/or school behaviors indicating
high interest and skill in discussion of scientific inquiry should be positively
related to EIB total score. This was not the case in the results from application
of a Pearson product-moment correlation formula to the scores; a zero-order r
resulted. Because the scattergram appeared to suggest a curvilinear relationship,
a curvilinear correlation was computed and found to be somewhat higher (.36 between
average peer rating and EIB-1 (N = 150); .30 between rating by teacher and EIB-1).
Neither statistic is convincing evidence of concurrent validity for EIB-1.
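
The contrast between a near-zero Pearson r and a higher curvilinear coefficient can be illustrated with a small sketch. The report does not name the curvilinear statistic; the correlation ratio (eta) is assumed here as a common choice, and the equal-width binning of the predictor, the bin count, and the sample data are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_ratio(x, y, bins=3):
    """Eta of y on x: sqrt(between-bin sum of squares / total sum of squares).

    x is cut into `bins` equal-width intervals (an assumed grouping rule).
    """
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0
    groups = {}
    for a, b in zip(x, y):
        idx = min(int((a - lo) / width), bins - 1)
        groups.setdefault(idx, []).append(b)
    grand = sum(y) / len(y)
    ss_total = sum((b - grand) ** 2 for b in y)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values()
    )
    return math.sqrt(ss_between / ss_total)


# Invented inverted-U data: ratings peak at middling test scores.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ratings = [1, 4, 7, 8, 9, 8, 7, 4, 1]
print(round(pearson_r(scores, ratings), 2), round(correlation_ratio(scores, ratings), 2))
```

With data shaped like an inverted U, the linear coefficient is essentially zero while eta is substantial, which is the pattern a scattergram of this kind would suggest. Eta cannot be negative and depends on how the predictor is grouped, so it should be read more cautiously than r.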

7. Koos, Eugenia M. and Afsar, Sibel. A Behavioral Checklist for Science Students.
Kansas City: McREL, 1968.